NSF PAR Search | NSF Public Access Repository

A survey of computational approaches for characterizing microbial interactions in microbial mats

https://doi.org/10.1186/s13059-025-03634-2

Perillo, Vanesa L.; Nute, Michael; Sapoval, Nicolae; Curry, Kristen D.; Golia, Logan; Yin, Yongze; Ogilvie, Huw A.; Nakhleh, Luay; Segarra, Santiago; Bhaya, Devaki; et al (June 2025, Genome Biology)
Parsnp 2.0: scalable core-genome alignment for massive microbial datasets

https://doi.org/10.1093/bioinformatics/btae311

Kille, Bryce; Nute, Michael G; Huang, Victor; Kim, Eddie; Phillippy, Adam M; Treangen, Todd J (May 2024, Bioinformatics)
Schwartz, Russell (Ed.)
Abstract Motivation: Since 2016, the number of microbial species with available reference genomes in NCBI has more than tripled. Multiple genome alignment, the process of identifying nucleotides across multiple genomes which share a common ancestor, is used as the input to numerous downstream comparative analysis methods. Parsnp is one of the few multiple genome alignment methods able to scale to the current era of genomic data; however, there has been no major release since its initial release in 2014. Results: To address this gap, we developed Parsnp v2, which significantly improves on its original release. Parsnp v2 provides users with more control over executions of the program, allowing Parsnp to be better tailored for different use-cases. We introduce a partitioning option to Parsnp, which allows the input to be broken up into multiple parallel alignment processes which are then combined into a final alignment. The partitioning option can reduce memory usage by over 4× and reduce runtime by over 2×, all while maintaining a precise core-genome alignment. The partitioning workflow is also less susceptible to complications caused by assembly artifacts and minor variation, as alignment anchors only need to be conserved within their partition and not across the entire input set. We highlight the performance on datasets involving thousands of bacterial and viral genomes. Availability and implementation: Parsnp v2 is available at https://github.com/marbl/parsnp.
Full Text Available
Microbial Community Profiling Protocol with Full‐length 16S rRNA Sequences and Emu

https://doi.org/10.1002/cpz1.978

Curry, Kristen D.; Soriano, Sirena; Nute, Michael G.; Villapol, Sonia; Dilthey, Alexander; Treangen, Todd J. (March 2024, Current Protocols)

Abstract 16S rRNA targeted amplicon sequencing is an established standard for elucidating microbial community composition. While high‐throughput short‐read sequencing can elicit only a portion of the 16S rRNA gene due to its limited read length, third generation sequencing can read the 16S rRNA gene in its entirety and thus provide more precise taxonomic classification. Here, we present a protocol for generating full‐length 16S rRNA sequences with Oxford Nanopore Technologies (ONT) and a microbial community profile with Emu. We select Emu for analyzing ONT sequences as it leverages information from the entire community to overcome errors due to incomplete reference databases and hardware limitations to ultimately obtain species‐level resolution. This pipeline provides a low‐cost solution for characterizing microbiome composition by exploiting real‐time, long‐read ONT sequencing and tailored software for accurate characterization of microbial communities. © 2024 Wiley Periodicals LLC.
Basic Protocol: Microbial community profiling with Emu
Support Protocol 1: Full‐length 16S rRNA microbial sequences with Oxford Nanopore Technologies sequencing platform
Support Protocol 2: Building a custom reference database for Emu
Full Text Available
Bakdrive: identifying a minimum set of bacterial species driving interactions across multiple microbial communities

https://doi.org/10.1093/bioinformatics/btad236

Wang, Qi; Nute, Michael; Treangen, Todd J. (June 2023, Bioinformatics)

Abstract Motivation: Interactions among microbes within microbial communities have been shown to play crucial roles in human health. In spite of recent progress, low-level knowledge of bacteria driving microbial interactions within microbiomes remains lacking, limiting our ability to fully decipher and control microbial communities. Results: We present a novel approach for identifying species driving interactions within microbiomes. Bakdrive infers ecological networks of given metagenomic sequencing samples and identifies minimum sets of driver species (MDS) using control theory. Bakdrive has three key innovations in this space: (i) it leverages inherent information from metagenomic sequencing samples to identify driver species, (ii) it explicitly takes host-specific variation into consideration, and (iii) it does not require a known ecological network. On extensive simulated data, we demonstrate that by identifying driver species from healthy donor samples and introducing them to the disease samples, we can restore the gut microbiome in recurrent Clostridioides difficile (rCDI) infection patients to a healthy state. We also applied Bakdrive to two real datasets, rCDI and Crohn's disease patients, uncovering driver species consistent with previous work. Bakdrive represents a novel approach for capturing microbial interactions. Availability and implementation: Bakdrive is open-source and available at: https://gitlab.com/treangenlab/bakdrive.
Multiple genome alignment in the telomere-to-telomere assembly era

https://doi.org/10.1186/s13059-022-02735-6

Kille, Bryce; Balaji, Advait; Sedlazeck, Fritz J.; Nute, Michael; Treangen, Todd J. (December 2022, Genome Biology)

Abstract With the arrival of telomere-to-telomere (T2T) assemblies of the human genome comes the computational challenge of efficiently and accurately constructing multiple genome alignments at an unprecedented scale. By identifying nucleotides across genomes which share a common ancestor, multiple genome alignments commonly serve as the bedrock for comparative genomics studies. In this review, we provide an overview of the algorithmic template that most multiple genome alignment methods follow. We also discuss prospective areas of improvement of multiple genome alignment for keeping up with continuously arriving high-quality T2T assembled genomes and for unlocking clinically-relevant insights.
Full Text Available
KOMB: K-core based de novo characterization of copy number variation in microbiomes

https://doi.org/10.1016/j.csbj.2022.06.019

Balaji, Advait; Sapoval, Nicolae; Seto, Charlie; Leo Elworth, R.A.; Fu, Yilei; Nute, Michael G.; Savidge, Tor; Segarra, Santiago; Treangen, Todd J. (January 2022, Computational and Structural Biotechnology Journal)

Full Text Available
Emu: species-level microbial community profiling of full-length 16S rRNA Oxford Nanopore sequencing data

https://doi.org/10.1038/s41592-022-01520-4

Curry, Kristen D.; Wang, Qi; Nute, Michael G.; Tyshaieva, Alona; Reeves, Elizabeth; Soriano, Sirena; Wu, Qinglong; Graeber, Enid; Finzer, Patrick; Mendling, Werner; et al (July 2022, Nature Methods)

Full Text Available
Long-Branch Attraction in Species Tree Estimation: Inconsistency of Partitioned Likelihood and Topology-Based Summary Methods

https://doi.org/10.1093/sysbio/syy061

Roch, Sebastien; Nute, Michael; Warnow, Tandy; Kubatko, Laura (September 2018, Systematic Biology)

Full Text Available
